Abstract

The budding yeast Saccharomyces cerevisiae is the first eukaryote whose genome has been
completely sequenced. It is also the first eukaryotic cell whose proteome (the set of all
proteins) and interactome (the network of all mutual interactions between proteins) have been
analyzed. In this paper we study the structure of the yeast protein complex network, in which
weighted edges between complexes represent the number of shared proteins. We find that the
network of protein complexes is a small world network with scale free behavior for many of its
distributions. However, we find no strong correlations between the weights and degrees of
neighboring complexes. To reveal non-random features of the network we also compare it with a
null model in which the complexes select their proteins randomly. Finally, we propose a simple
evolutionary model based on duplication and divergence of proteins.
I. INTRODUCTION

In recent years complex networks have attracted much interest as models of real-life networks
such as social, biological and communication networks. Certainly, introducing the essential
static and dynamic features of these networks can help us toward a better understanding of
their various properties [10]. A well known property of most of these networks, the so called
small world property, indicates that the average distance between any two nodes increases
slowly with the size of the network (i.e. as the logarithm of the size). This in turn can lead
to a fast spreading of effects in the network and so increases the finite size effects when one
studies, for example, diffusion in such a network [10]. Extensive studies also indicate the
importance of the degree distribution (the degree being the number of neighbors of a node) for
the static and dynamic behavior of the network. One can also add various kinds of
correlations, e.g. the degree correlation of two neighbors, to the list of these important
features [12].



Protein interaction networks are important examples of real-life networks, in which nodes and
edges represent proteins and the interactions between them, respectively [13, 14, 15, 17, 18].
Proteins have traditionally been recognized on the basis of their roles as enzymes, signalling
molecules or structural components in cells and micro-organisms. The most rudimentary
structural information about the proteome (the assembly of proteins in an organism) is the
pattern of interactions between different proteins. Determining such connections helps us
understand the backbone of functional relationships between proteins and the pathways for the
propagation of various signals among them. Besides the specific detailed information about the
pattern of interactions in a single proteome, which is certainly important for its functioning,
some general characteristics of these networks are also important in that they may point to
universal properties of organisms. For example, it has been shown that, as far as the
interactions of individual proteins are concerned, the interaction network of the budding yeast
Saccharomyces cerevisiae (S. cerevisiae) is a scale free network. This property is itself a
hint to the robustness of the protein interaction network against the random removal of
proteins.



However, recent progress indicates that each of the central processes in a cell is catalyzed
not by a single protein but by the coordinated action of a highly linked set of several
proteins, called a protein complex [21, 22, 23]. Thus complexes act as protein machines and
have evolved for the same reason that humans have invented mechanical and electronic machines.
It is also remarkable that a single protein may be shared by several different complexes. It is
expected that a protein with a general functionality is shared by many complexes. On the other
hand, it seems that the weight of a complex (the number of proteins it contains) depends on the
degree of sophistication of its tasks.

It is in view of this newly emerging picture of the proteome as a coordinated ensemble of
protein complexes that we study the general properties of the network of protein complexes of
the budding yeast. Certainly, analysis of the proteome map at both the protein and the complex
level will result in a better understanding of how the proteome functions. Although from a
biological point of view the precise nature of the proteins and their interactions in a single
living organism is important, from a physical point of view, in which we seek general
universal patterns among many different organisms, we can study the most elementary features of
a proteome, i.e. the weight distribution of complexes, the distribution of the number of
proteins shared between two complexes, etc.

Based on the extensive information provided in [23] we constructed a weighted graph
corresponding to the network of complexes. By this we mean that both the nodes and the edges
are assigned weights. The weight of a node is equal to the number of proteins in the complex it
represents. Two nodes are connected by a weighted edge, whose weight indicates how many
proteins the two complexes have in common. Such a point of view seems to be the first step
toward understanding the integration and coordination of cellular functions. Connections in
this network not only reflect physical interaction of complexes, but may also represent common
regulation, localization, turnover or architecture [23]. In addition, this construction has a
meaningful interpretation in other real-life networks such as social networks, where a
connection between two communities can only be established by individuals common to both. Note
that one may consider a single protein complex as a subgraph of the protein interaction network
with a high level of interconnection between its elements. From this point of view the protein
complex network is a large scale or coarse grained picture of the protein interaction network.
However, we stress that in view of the recent findings, this is not the correct picture for the
interactome. In fact the important feature of the recent experiments compared to the previous
ones is that they uncover not only the pairwise interactions but also ternary, quaternary and
higher interactions between different proteins in a complex.
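
The construction described above can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the function name build_complex_network and the toy
labels are hypothetical, and the input is assumed to be a plain mapping from each complex to
the set of proteins it contains.

```python
from itertools import combinations


def build_complex_network(complexes):
    """Project protein membership onto a weighted network of complexes.

    complexes: dict mapping a complex label to an iterable of protein labels.
    Returns (node_weight, edge_weight): node_weight[c] is the number of
    proteins in complex c, and edge_weight[(a, b)] is the number of proteins
    shared by complexes a and b (pairs sharing no protein are omitted).
    """
    members = {c: set(proteins) for c, proteins in complexes.items()}
    node_weight = {c: len(proteins) for c, proteins in members.items()}
    edge_weight = {}
    for a, b in combinations(sorted(members), 2):
        shared = len(members[a] & members[b])
        if shared > 0:
            edge_weight[(a, b)] = shared
    return node_weight, edge_weight


# Toy data (hypothetical labels, not taken from the yeast data set).
toy = {"C1": {"p1", "p2", "p3"}, "C2": {"p2", "p3", "p4"}, "C3": {"p5"}}
node_weight, edge_weight = build_complex_network(toy)
print(node_weight)  # {'C1': 3, 'C2': 3, 'C3': 1}
print(edge_weight)  # {('C1', 'C2'): 2}
```

In this toy example C3 shares no protein with the other complexes and therefore appears as an
isolated node of weight 1.
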
In this work we show that the above protein complex network is a small world network with a
scale free degree distribution. Moreover, we find that the distributions of the weights of
complexes, the weights of edges and the coordination numbers of proteins (the number of
complexes in which a protein participates) follow power laws, which may in turn point to a kind
of preferential attachment in the evolution of the protein complex network. We compare the
network with two null models: the Erdős and Rényi random graph and a random-selection model in
which each complex selects its proteins randomly from the list of all proteins. Unlike the
former, the latter model results in considerable clustering for the network of complexes. Here
by clustering we mean the probability that two neighbors of a node are also connected to each
other. In this sense the clustering is equivalent to the transitivity of the network (defined
as three times the ratio of the number of triangles in the network to the number of connected
triples of nodes [27]). Moreover, we find that the random-selection model reproduces well the
degree distribution of complexes and the dependence of the degree of a complex on its weight.
However, some important distributions of the network, such as the weights of edges, are still
far from the predictions of this model in the region of large weights. Following the previous
models for the evolution of the protein interaction network [12], we propose a simple
evolutionary model based on duplication and divergence (mutation) of proteins. We show that
this model reproduces the power law behavior of some essential distributions such as the weight
of complexes and the weight of edges.

The paper is organized as follows. Section II is devoted to the description of the budding
yeast protein complex network based on the data provided in [23]. Section III gives a
comparison between the random-selection model and the real network. The evolutionary model is
introduced in Section IV. We conclude the paper in Section V.



II. STRUCTURAL PROPERTIES OF THE YEAST PROTEIN COMPLEX NETWORK



According to the data we have extracted from [23], S. cerevisiae includes $n = 1398$ proteins,
organized in $N = 232$ complexes. The resulting network is then composed of 232 nodes and
$E = 2043$ edges. A complex of weight $m$ is denoted by a node of weight $m$, and if two
complexes share $w$ proteins, a weight $w$ is assigned to the edge connecting them. Figures
(1-a) and (1-b) show the distributions of the weights of the nodes, $S(m)$, and of the weights
of the edges, $E(w)$. Both distributions show power law behavior, $S(m) \sim m^{-\tau_m}$ and
$E(w) \sim w^{-\tau_w}$, with exponents $\tau_m = 1.33 \pm 0.06$ and $\tau_w = 2.4 \pm 0.1$,
respectively. The observed deviation in the number of complexes of weight 1 from the expected
power law behavior could be mostly attributed to experimental limitations. The average weights
of the nodes and the edges turn out to be $\bar{m} = 11.48$ and $\bar{w} = 1.79$, respectively,
with dispersions $\sigma_m := \sqrt{\overline{m^2} - \bar{m}^2} = 14.69$ and $\sigma_w = 2.9$.
One could attribute the large values of these dispersions to the small size of the network and
the scale free nature of the related distributions.

FIG. 1: (a) Weight distribution of complexes and (b) weight distribution of edges in the yeast
protein complex network. Lines in both figures show fitted power laws with the exponents given
in the text.

A curious property of the network is that there is no correlation between the weights of
adjacent nodes. In other words, the probability that an edge emanating from a complex of weight
$m$ encounters another complex of weight $m'$ is independent of $m$. To show this we compute
the associated correlation coefficient $r_{mm}$, which in the protein complex network has the
value $-0.004$ and is defined as follows: let $P(m, m')$ be the probability that an arbitrary
edge lies between two complexes of weights $m$ and $m'$. Thus $\Pi(m) = \sum_{m'} P(m, m')$
gives the probability of finding a complex of weight $m$ at the end point of an arbitrary edge.
Now the correlation coefficient is given by

$$ r_{mm} = \frac{\sum_{m,m'} m m' \left( P(m,m') - \Pi(m)\Pi(m') \right)}{\sum_m m^2 \Pi(m) - \left( \sum_m m \Pi(m) \right)^2}. \qquad (1) $$

It is clear that in the absence of any correlation (i.e. when $P(m,m') = \Pi(m)\Pi(m')$) we
have $r_{mm} = 0$. If similar complexes have a high tendency to be connected to each other, we
say that the network is assortative [11] in this respect and the correlation coefficient will
be positive. Otherwise the network is disassortative and the correlation coefficient will be
negative. Obviously one can apply this definition to any two point distribution to measure the
degree of correlation between the two variables.
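
As a minimal sketch of how such a coefficient can be evaluated on an edge list, the Python
function below computes the covariance of the values found at the two ends of an edge, divided
by the variance of the value found at one end. The name edge_end_correlation and the toy graphs
are hypothetical; each undirected edge is counted in both directions so that the joint
distribution is symmetric, which is an assumption about the intended normalization.

```python
def edge_end_correlation(edges, value):
    """Correlation of a node attribute across the two ends of an edge.

    edges: iterable of undirected edges (u, v).
    value: dict mapping each node to a number (e.g. its weight or degree).
    """
    xs, ys = [], []
    for u, v in edges:
        # Count each edge in both directions so that P(m, m') = P(m', m).
        xs += [value[u], value[v]]
        ys += [value[v], value[u]]
    n = len(xs)
    mean = sum(xs) / n                      # sum_m m Pi(m)
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mean * mean
    var = sum(x * x for x in xs) / n - mean * mean
    if var == 0:
        raise ValueError("all edge ends carry the same value")
    return cov / var


# A star: one node of weight 3 linked to three nodes of weight 1.
star = [(0, 1), (0, 2), (0, 3)]
print(edge_end_correlation(star, {0: 3, 1: 1, 2: 1, 3: 1}))   # -1.0

# Two separate edges, each joining nodes of equal weight.
pairs = [("a", "b"), ("c", "d")]
print(edge_end_correlation(pairs, {"a": 1, "b": 1, "c": 3, "d": 3}))  # 1.0
```

Passing node degrees instead of node weights as `value` gives the analogous degree-degree
coefficient.
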



One may also ask what the relation is between the degree of a complex and its weight. Our
results again give a power law dependence for large values of complex weights. The total weight
of the edges emanating from a complex of weight $m$ (its weighted degree) scales as
$k_w(m) \propto m^{\beta_w}$, where $\beta_w \simeq 0.95 \pm 0.07$, figure (2-a). The number of
neighbors behaves in a similar way, $k(m) \propto m^{\beta}$ with $\beta \simeq 0.55 \pm 0.07$,
figure (2-b). As expected, $k_w(m)$ grows faster than $k(m)$ with the weight of the complex.
Roughly speaking, since $\beta_w - \beta \simeq 2/5$, the above relations suggest that the
average weight of an edge emanating from a complex of weight $m$ scales as $m^{2/5}$.
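
The step behind the last statement can be written out explicitly. The average weight of an edge
attached to a complex of weight $m$ is its weighted degree divided by its degree, so the two
fitted exponents simply subtract. The symbol $\langle w \rangle(m)$ is introduced here only for
illustration, and the combined uncertainty assumes that the two fits are independent.

```latex
\langle w \rangle(m) \;=\; \frac{k_w(m)}{k(m)}
  \;\propto\; \frac{m^{\beta_w}}{m^{\beta}}
  \;=\; m^{\beta_w - \beta},
\qquad
\beta_w - \beta \simeq 0.95 - 0.55 = 0.40 = \tfrac{2}{5},
\qquad
\sqrt{0.07^2 + 0.07^2} \approx 0.10 .
```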



FIG. 2: Dependence of (a) weighted degree and (b) degree of a complex on its weight.

To address the topological properties of the graph we also studied the degree distribution,
$P(k)$, and the correlation between the degrees of neighboring nodes, which can again be
detected by computing the related correlation coefficient, $r_{kk}$. Note that by the degree of
a node we mean the number of neighboring nodes, irrespective of the weights of the edges
emanating from that node. By the weighted degree of a node we mean the sum of the weights of
the edges emanating from this node. In figure (3) we show that the degree distribution is scale
free with a sharp cutoff at its tail. In this figure we also see a power law fit of the form
$P(k) \sim k^{-\gamma}$ with exponent $\gamma = 0.6 \pm 0.04$ to the real data. Moreover, we
find the value $-0.06$ for $r_{kk}$, which means that, in contrast to protein interaction
networks, there is no strong correlation between the degrees of adjacent nodes in the protein
complex network. Our results show that the same conclusion is true if we also take the weights
of the edges into account.

Usually a given protein takes part in more than one, say $q$, complexes. We call $q$ the
coordination number of that protein. We found that on average a protein takes part in
$\bar{q} = 1.91$ complexes (with $\sigma_q = 1.86$). Figure (4) shows the number of proteins
versus their coordination numbers, $R(q)$. The distribution is scale free for large
coordination numbers, that is $R(q) \sim q^{-\tau_q}$ with exponent $\tau_q = 2.95 \pm 0.12$.
This indicates that there are a few proteins with high coordination numbers. It seems that
these are the proteins with an exceedingly important role in the functioning of the cell.

FIG. 4: Coordination number distribution of proteins in the yeast protein complex network.

Let us compute the correlation coefficient which measures the correlation between the
coordination numbers of proteins and the weights of the complexes they contribute to. To this
end we consider the collection of proteins and complexes as a bipartite network in which nodes
of the first type represent complexes and those of the second type represent proteins. An edge
in this network connects a node of the first kind to a node of the second kind. Thus the number
of edges emanating from a protein determines its coordination number, and similarly the number
of edges connected to a complex gives its weight. Now we define $P(m, q)$ as the probability
that an arbitrary edge of this bipartite network connects a complex of weight $m$ to a protein
of coordination number $q$. Therefore $\Pi(m) := \sum_q P(m, q)$ and
$\pi(q) := \sum_m P(m, q)$ are respectively the probabilities that an arbitrary edge reaches a
complex of weight $m$ and a protein of coordination number $q$. Now the associated correlation
coefficient is defined as

$$ r_{mq} := \frac{\sum_{m,q} m q \left( P(m,q) - \Pi(m)\pi(q) \right)}{\sqrt{\sum_m m^2 \Pi(m) - \left( \sum_m m \Pi(m) \right)^2}\; \sqrt{\sum_q q^2 \pi(q) - \left( \sum_q q \pi(q) \right)^2}}. \qquad (2) $$

We find that for our data $r_{mq} = 0.024$, which indicates the absence of correlation in this
respect. That is, the coordination number of a protein does not affect its association with
complexes of different weights more than what is expected by chance.
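
The bipartite coefficient can be evaluated directly from a membership table, since every
(complex, protein) membership is one edge of the bipartite network. The sketch below is an
illustration only; the function name bipartite_correlation and the toy data are hypothetical.

```python
from math import sqrt


def bipartite_correlation(membership):
    """Correlation between complex weight m and protein coordination number q,
    taken over the edges of the complex-protein bipartite network.

    membership: dict mapping a complex label to an iterable of protein labels.
    """
    members = {c: set(p) for c, p in membership.items()}
    coordination = {}
    for proteins in members.values():
        for p in proteins:
            coordination[p] = coordination.get(p, 0) + 1
    # One (m, q) pair per bipartite edge.
    pairs = [(len(proteins), coordination[p])
             for proteins in members.values() for p in proteins]
    n = len(pairs)
    mean_m = sum(m for m, _ in pairs) / n
    mean_q = sum(q for _, q in pairs) / n
    cov = sum(m * q for m, q in pairs) / n - mean_m * mean_q
    var_m = sum(m * m for m, _ in pairs) / n - mean_m ** 2
    var_q = sum(q * q for _, q in pairs) / n - mean_q ** 2
    return cov / sqrt(var_m * var_q)


# Toy example: protein "a" sits in both complexes, "b" only in the larger one.
r_toy = bipartite_correlation({"C1": {"a", "b"}, "C2": {"a"}})
print(round(r_toy, 3))  # -0.5
```
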



The network of complexes also defines pathways for the propagation of various signals such as
phosphorylation and allosteric regulation of proteins. For such functions, a key parameter of
the network is its diameter, defined as the length of the shortest path between the remotest
nodes in the giant component of the network. Our analysis reveals that the network has a small
diameter, $D = 5$, which points to the small world property of the network. To confirm this we
also computed the average of $1/d_{ij}$ (where $d_{ij}$ denotes the shortest distance between
nodes $i$ and $j$) and compared it with the corresponding quantity in an equivalent
Erdős-Rényi random graph, table (I). The reason for computing $1/d_{ij}$ is that the above
graph is not connected; rather it has a giant component of size $S_g = 198$, a binary component
and 32 single nodes. To have a measure of clustering in the network, we extracted its
transitivity $T$ and found it to be almost six times greater than that of an Erdős-Rényi
random graph, see table (I). This high level of transitivity, which is another hint to the
small world property of the protein complex network, would give rise to the robustness of the
network against the random removal of nodes or edges.
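
A minimal sketch of how the three quantities used above (transitivity, diameter of the giant
component and mean inverse distance) can be computed for an unweighted graph is given below.
The function names are hypothetical, the graph is assumed to be stored as a dict of neighbor
sets with at least two nodes, and disconnected pairs are assumed to contribute zero to the mean
of $1/d_{ij}$.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def transitivity(adj):
    """Three times the number of triangles over the number of connected triples."""
    closed = 0   # closed triples; every triangle is counted once per corner
    triples = 0  # connected triples centred on each node
    for nbrs in adj.values():
        k = len(nbrs)
        triples += k * (k - 1) // 2
        closed += sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return closed / triples if triples else 0.0


def diameter_and_mean_inverse_distance(adj):
    """Diameter of the largest component and the mean of 1/d_ij over all pairs
    (assumption: pairs in different components contribute 0)."""
    dist = {s: bfs_distances(adj, s) for s in adj}
    giant = max((set(d) for d in dist.values()), key=len)
    diameter = max(dist[s][t] for s in giant for t in giant)
    n = len(adj)
    total = sum(1.0 / d for s in adj for t, d in dist[s].items() if t != s)
    return diameter, total / (n * (n - 1))


# Toy graph: a path 0-1-2 plus an isolated node 3.
toy_adj = {0: {1}, 1: {0, 2}, 2: {1}, 3: set()}
print(transitivity(toy_adj))                        # 0.0
print(diameter_and_mean_inverse_distance(toy_adj))  # (2, 0.4166...)
```
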



III. THE RANDOM-SELECTION MODEL 



To highlight the special features of the yeast protein complex network and to possibly draw
conclusions of biological interest, one may compare it to a random network in which proteins
aggregate randomly to form different complexes. We call such a model a random-selection model.
The simplest such model may be constructed as a bipartite network as follows: one takes a
bipartite network consisting of two types of nodes. Nodes of the first type represent complexes
and those of the second type represent proteins. A bipartite network of this kind consisting of
$N$ complexes and $n$ proteins, and the resulting weighted network consisting of only protein
complexes, are depicted in figure (5). Here we start from a bipartite network and calculate
many of the properties of the resulting protein complex network exactly. A given complex $\mu$
contains a number of proteins which we denote by $m_\mu$. In general $m_\mu \ll n$. From the
real data one can infer the actual sequence $(m_1, m_2, \ldots, m_N)$. One thus assigns free
(unconnected) stubs to the first type of nodes (the complexes) according to this sequence and
then connects these stubs randomly to the second type of nodes (proteins). Each protein
connected to a complex is contained in that complex. In this way one obtains another
distribution $R(q)$, where $q$ is the number of complexes a given protein participates in.
Clearly a given protein may be contained in more than one complex. Note that a peculiar feature
of this model is that certain proteins may not be connected to any complex at all. Later in
this section we introduce another slightly improved model in which this effect does not happen.

FIG. 5: (a) A bipartite graph of N complexes and n proteins. Connections can only be
established between proteins and complexes. (b) The resulting weighted graph of protein
complexes. The number of lines connecting two complexes represents the number of shared
proteins.
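
One realization of the random-selection model can be sketched as follows. This is an
illustration under stated assumptions: the function names are hypothetical, and each complex is
assumed to draw distinct proteins uniformly at random, consistent with the counting argument
used for the overlap probability in this section.

```python
import random


def random_selection_model(weights, n, seed=None):
    """Let each complex pick its proteins uniformly at random.

    weights: sequence (m_1, ..., m_N) of complex weights taken from the data.
    n: total number of proteins, labelled 0 .. n-1.
    Returns a list of sets, one set of protein labels per complex.
    """
    rng = random.Random(seed)
    return [set(rng.sample(range(n), m)) for m in weights]


def coordination_numbers(complexes, n):
    """Number of complexes each protein belongs to (0 for unselected proteins)."""
    q = [0] * n
    for proteins in complexes:
        for p in proteins:
            q[p] += 1
    return q


# Toy run with hypothetical sizes: three complexes drawn from ten proteins.
model = random_selection_model([3, 5, 2], n=10, seed=1)
q = coordination_numbers(model, n=10)
print([len(c) for c in model])  # [3, 5, 2]
print(sum(q))                   # 10, the total number of stubs
print(q.count(0))               # proteins left out of every complex
```

The last line counts the proteins that were never selected, which is the peculiar feature of
the model noted in the text.
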

In the following we calculate many of the interesting quantities of the resulting weighted
network of complexes exactly and compare them with the real data to check the viability of this
model. To proceed, let $S(m)$ denote the number of complexes of weight $m$. Obviously
$\sum_m S(m) = N$, where $N$ is the total number of complexes. First let us calculate the
probability that two complexes of weights $m$ and $m'$ have $w$ proteins in common. Denote this
probability by $P(m,m';w)$. The first complex chooses its $m$ members freely from the
collection of all proteins. The second complex has $\binom{n}{m'}$ ways of choosing its members
from the collection, of which $\binom{m}{w}$ ways are available for choosing $w$ members in
common with the first complex and $\binom{n-m}{m'-w}$ ways are available for choosing the
remaining set disjoint from the first complex. Hence the probability is

$$ P(m,m';w) = \frac{\binom{m}{w} \binom{n-m}{m'-w}}{\binom{n}{m'}}. \qquad (3) $$

This equation can be rewritten in the form

$$ P(m,m';w) = \frac{m!\, m'!\, (n-m)!\, (n-m')!}{(m-w)!\, (m'-w)!\, n!\, w!\, (n+w-m-m')!} \qquad (4) $$

to make its symmetry under the interchange of $m$ and $m'$ manifest. The probability that two
such complexes have no common members (are not connected to each other in the network) is:

$$ P(m,m';0) = \frac{(n-m)!\, (n-m')!}{n!\, (n-m-m')!}. \qquad (5) $$

For $m, m' \ll n$ one can approximate $(n-m)!/n!$ by $n^{-m}$ and $(n-m')!/(n-m-m')!$ by
$(n-m')^{m}$. So the above relation takes the form

$$ P(m,m';0) \simeq \left(1 - \frac{m'}{n}\right)^{m} \simeq e^{-mm'/n}. \qquad (6) $$

Thus the probability of two complexes being connected will be $1 - P(m,m';0)$ and the average
number of edges will be:

$$ E = \frac{1}{2} \sum_{m,m'} S(m) S(m') \left(1 - P(m,m';0)\right). \qquad (7) $$

Moreover the average number of edges with weight $w$ is given by:

$$ E(w) = \frac{1}{2} \sum_{m,m'} S(m) S(m') P(m,m';w). \qquad (8) $$
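
The overlap probability above is a hypergeometric distribution, so its basic properties are
easy to check numerically. The sketch below is an illustration only; the function name
overlap_prob is hypothetical, and the weights 10 and 12 are arbitrary example values combined
with the quoted protein number n = 1398.

```python
from math import comb, exp


def overlap_prob(n, m, m_prime, w):
    """Probability that two random complexes of weights m and m_prime, drawn
    from n proteins, share exactly w proteins (hypergeometric form)."""
    if w < 0 or w > min(m, m_prime) or m_prime - w > n - m:
        return 0.0
    return comb(m, w) * comb(n - m, m_prime - w) / comb(n, m_prime)


n, m, m_prime = 1398, 10, 12

# Normalization: the probabilities over all possible overlaps sum to one.
total = sum(overlap_prob(n, m, m_prime, w) for w in range(min(m, m_prime) + 1))
print(abs(total - 1.0) < 1e-12)   # True

# Symmetry under the interchange of m and m'.
print(abs(overlap_prob(n, m, m_prime, 2) - overlap_prob(n, m_prime, m, 2)) < 1e-15)  # True

# The exponential approximation for the probability of no overlap.
exact = overlap_prob(n, m, m_prime, 0)
approx = exp(-m * m_prime / n)
print(abs(exact - approx) < 5e-3)  # True
```
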

Figure (6-a) shows this quantity for a model network with the same parameters as the yeast
protein complex network. We see that the weight distribution of edges decreases exponentially,
in contrast to the power law behavior found in the protein complex network. This means that in
the yeast protein complex network there are edges with very high weights. These high weight
edges inevitably connect high weight complexes. Thus in the yeast complex network, the hubs
(complexes with high degree) are connected to each other intensively. This is in contrast with
the protein interaction network, where the high degree proteins have a larger tendency to be
connected to low degree ones.

FIG. 6: Comparison of the S. cerevisiae protein complex network (squares) with the analytic
predictions of the random-selection model (lines): (a) weight distribution of edges and (b)
degree of a complex versus its weight.

In the same way one obtains $k(m)$, the average number of neighbors of a given complex of
weight $m$:

$$ k(m) = \sum_{m'} S(m') \left(1 - P(m,m';0)\right). \qquad (9) $$

In figure (6-b) we compare the above expression with the one obtained from the real network.
One finds good agreement between the predictions of the model and the real data. From the
distribution of the weights of complexes $S(m)$ and the above relation, one can obtain the
distribution of degrees $P(k)$ using the relation $P(k)\Delta k = S(m)\Delta m$. Thus we expect
the degree distribution of the random-selection model to be close to the real one as well.



What can be said about $r(q) := R(q)/n$, the probability that a given protein contributes to
$q$ complexes? It is not difficult to show that, due to the fully random nature of the
selection process, the above distribution has the binomial form

$$ r(q) = \binom{N\bar{m}}{q} \left(\frac{1}{n}\right)^{q} \left(1 - \frac{1}{n}\right)^{N\bar{m} - q}, \qquad (10) $$

where $\bar{m}$ is the average weight of complexes. Thus for very large values of $N$ and $n$
we have

$$ r(q) = \frac{\lambda^{q}}{q!} e^{-\lambda}, \qquad \lambda = \frac{N}{n}\bar{m}. \qquad (11) $$

Therefore we get a Poisson distribution for the coordination numbers of proteins, in
distinctive contrast to the real distribution, which is scale free.
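
A quick numerical comparison of the binomial and Poisson forms, using the values quoted in
Section II (N = 232 complexes, n = 1398 proteins, average complex weight 11.48), is sketched
below. The function names are hypothetical, and rounding the total number of stubs to an
integer is an assumption made only so that the binomial coefficient is defined.

```python
from math import comb, exp, factorial


def binomial_r(q, N, m_bar, n):
    """Binomial probability that a protein is hit by q of the N * m_bar stubs."""
    stubs = round(N * m_bar)  # assumption: total stub count rounded to an integer
    return comb(stubs, q) * (1 / n) ** q * (1 - 1 / n) ** (stubs - q)


def poisson_r(q, lam):
    """Poisson limit of the binomial form."""
    return lam ** q * exp(-lam) / factorial(q)


N, n, m_bar = 232, 1398, 11.48
lam = N * m_bar / n
print(round(lam, 3))  # 1.905

for q in range(6):
    print(q, round(binomial_r(q, N, m_bar, n), 4), round(poisson_r(q, lam), 4))
```

The mean of this Poisson distribution, about 1.9, is by construction the average number of
complexes per protein; what the random-selection model misses is the heavy tail of the observed
distribution.
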



As seen from figure (6), our analytical treatment of this model revealed that some of the
general characteristics of these networks (the degree of a given complex as a function of its
weight, and hence the degree distribution of complexes) are close to those of the S. cerevisiae
protein complex network. Further results derived from numerical simulations, given in table
(I), show that the random-selection model is rather close to the yeast protein complex network.
Note however that certain discrepancies between the random-selection model and the real complex
network, e.g. in the distribution of the weights of edges, persist even if we improve this
model by fixing the coordination numbers of proteins from the beginning to be the same as those
derived from the real data of the protein complex network (see figure (7)). The evolutionary
model that we introduce in the next section aims to remove this discrepancy.



FIG. 7: Comparison of the S. cerevisiae protein complex network (squares) with predictions of
the improved random-selection model (circles): (a) degree distribution of complexes (b) weight
distribution of edges.

IV. THE EVOLUTIONARY MODEL

In this section we introduce a simple model aimed at representing the evolution of the protein
complex network. The model is based on the extension of the hypothesis of evolution by
duplication and divergence of proteins to the level of protein complexes. DNA duplication has
long been known as an important factor in the evolution of genome size. This process almost
certainly explains the presence of large families of genes with related functions in
biologically complex organisms. Consider a set of proteins which belong to a number of
functional units or complexes. In each evolutionary step, the following actions take place. A
protein is randomly chosen and duplicated. This means that a new protein is added to the
proteome and contributes to all the complexes in which its mother participates. Then the new
protein undergoes mutations; it loses its membership in any given complex with probability
$\mu_o$, and enters any other complex with probability $\mu_i$. The protein may also create a
new complex (a novel functionality) on its own, with probability $\mu_c$. Figure (8) shows what
happens in a typical step of the network evolution. Note that during this process the new
protein may end up contributing to no complex, and thus leaves the proteome. Starting from one
protein in one complex, we have taken $t = 1500$ evolutionary steps. We found that this number
of steps most closely reproduces the desired results. We found that in this situation only
about 7 percent of the duplicated proteins are discarded, while the others take part in at
least one complex. Having $t$, we also have the probability of creating a new complex in each
step, because $N$, the number of complexes, satisfies $N = 1 + \mu_c t$. The other parameters
can be determined by requiring that the number of proteins and some important distributions
(e.g. the weight distribution of complexes) be as close as possible to the real data. Moreover,
note that we also expect the probability of entering a new complex for a duplicated protein to
be much smaller than the probability of exiting one of its inherited complexes. Summing up
these points we find, by fitting to the real data, the following values for the parameters of
the model: $\mu_o \simeq 0.4$, $\mu_i \simeq 0.01$ and $\mu_c \simeq 0.154$. The results of this
model are shown in figure (9) and table (I). In figure (9-a) we see that the model closely
reproduces the scale free behavior of the weight distribution of complexes in the region of
large weights, although there are some deviations from the real data for the number of small
complexes. Indeed we expected such a power law behavior in advance, due to the presence of
preferential attachment in the process of evolution of complexes. Note that although the
selection of proteins for duplication is a completely random procedure, complexes with higher
weights have a higher chance of including the new duplicated protein. In this way the evolution
of complexes is governed by the well known mechanism of preferential attachment, which is a
principal way to produce scale free behavior in the realm of complex networks.

FIG. 8: A typical step of the evolutionary model in which protein 1 has been duplicated and
the edges of the new protein (protein 2) have undergone mutations.

FIG. 9: Comparison of the yeast protein complex network (squares) with the evolutionary model
(circles): (a) weight distribution of complexes (b) weight distribution of edges. The
evolutionary model distributions are results of averaging over 2000 runs of the evolutionary
process. The associated relative statistical errors are of the order of a few percent.
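
One run of the duplication and divergence process can be sketched as follows. This is a sketch
under stated assumptions rather than the authors' implementation: the function name evolve is
hypothetical, the order in which the loss, gain and creation steps are applied is an
assumption, and a newly created complex is assumed to contain only the new protein.

```python
import random


def evolve(steps, mu_o, mu_i, mu_c, seed=None):
    """Simulate one run of the duplication-divergence model.

    mu_o: probability of losing membership in an inherited complex.
    mu_i: probability of entering a complex the mother does not belong to.
    mu_c: probability of founding a new complex in a given step.
    Returns (complexes, proteins): complexes[c] is the set of proteins in
    complex c, proteins[p] is the set of complexes containing protein p.
    """
    rng = random.Random(seed)
    complexes = [{0}]  # start from one protein in one complex
    proteins = [{0}]
    for _ in range(steps):
        mother = rng.choice(proteins)  # duplicate a randomly chosen protein
        member_of = set()
        for c in range(len(complexes)):
            if c in mother:
                if rng.random() >= mu_o:   # inherited membership survives
                    member_of.add(c)
            elif rng.random() < mu_i:      # membership gained by mutation
                member_of.add(c)
        if rng.random() < mu_c:            # novel functionality: new complex
            complexes.append(set())
            member_of.add(len(complexes) - 1)
        if member_of:                      # otherwise the protein is lost
            new_id = len(proteins)
            proteins.append(member_of)
            for c in member_of:
                complexes[c].add(new_id)
    return complexes, proteins


complexes, proteins = evolve(1500, mu_o=0.4, mu_i=0.01, mu_c=0.154, seed=0)
print(len(complexes), len(proteins))
```

With these parameter values the expected number of complexes is 1 + 0.154 x 1500 = 232, as in
the relation quoted in the text; a single run fluctuates around this value, which is why
distributions are compared after averaging over many runs.
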

Figure (9-b) shows the weight distribution of edges in the protein complex network resulting
from the evolutionary model. The agreement with the real data is excellent. This is one of the
essential features of the protein complex network which none of the random models studied in
this paper could exhibit. We believe that this fact points to the essential role of duplication
and divergence of proteins in the evolution of the protein complex network. Moreover, as table
(I) shows, the evolutionary model results in a small world protein complex network. Both the
transitivity and the diameter of the model are comparable with the values of the real network
of complexes.

This model also exhibits a slight tendency toward negative correlation in the weights and
degrees of neighboring complexes. Indeed the related correlation coefficients turn out to be
$r_{mm} = -0.05 \pm 0.001$ and $r_{kk} = -0.08 \pm 0.002$. This means that complexes with
different weights and degrees have a slightly larger probability of being connected to each
other.

Finally it is worthwhile to note that the data studied in this paper are those of a single
proteome, that of the yeast, whereas the results given in this section are averages over many
runs of the model.

V. CONCLUSION 

To summarize, we have shown that the protein complex network of the yeast is a nearly
uncorrelated, small world, scale free network. Power law behavior was also found in other
significant distributions such as the weight of complexes, the weight of edges and the
coordination number of proteins. Although some of these features (e.g. the small world property
and high clustering) were expected in advance, this study revealed some distinctive properties
of this network, like the scale free behavior of the weight distribution of edges. We also
compared the yeast protein complex network with a random-selection model and a simple
evolutionary model. It was found that the random-selection model can satisfactorily reproduce
the relation between the weight and the degree of a complex and also the degree distribution of
the real network. However, this model failed to give the power law behavior of the distribution
of the weights of edges, a property which is well reproduced by the evolutionary model. In the
latter model the desired distributions arise automatically, just by fitting the model
parameters to the real data.

From the evolutionary point of view, the above study can also be a hint to the essential role
of duplication and divergence processes in the evolution of the proteome, and this study is
indeed an extension of this mechanism to the level of protein complexes. Certainly the results
of the previous studies on the protein interaction network, along with the investigation of
large scale properties of the protein complex network, will help toward a better understanding
of the general behavior of such systems.
ER Model | RS Model | Evolutionary Model

N      | 232 | 232 | 232 | 232.7±0.3
n      | 1398 | 1364.7±0.4
k      | 17.6 | 23.8±0.05 | 29.57±0.06
k_w    | 31.62 (45.73) | 31.62
q      | 1.91 | 2.73±3×10^-3
T      | 0.41 | 0.29±10^-?
D      | 5 | 4.4±0.1 | 4.4±0.1 | 4.02±4×10^-3
1/d_ij | 0.35 (0.27) | 0.49±10^-? | 0.47±10^-3 | 0.48±10^-3 | 0.54±10^-?

TABLE I: Comparison of the yeast protein complex network (PCN) with some models: Erdős and
Rényi (ER) random graph, the random-selection model (RS), improved random-selection model and
the evolutionary model. The first column represents: number of complexes (N), number of
proteins (n), average degree (k), weighted degree (k_w), coordination number of proteins (q),
transitivity (T), diameter of network (D) and inverse of shortest distance between two
arbitrary nodes (1/d_ij). Values in parentheses refer to the dispersion of the associated
quantity. Statistical errors are given where we have averaged over different realizations.

[26] P. Erdős and A. Rényi, Publ. Math. Inst. Hung. Acad. Sci. 5, 17 (1960).
[27] M. E. J. Newman, Phys. Rev. E 68, 026121 (2003).
[28] M. E. J. Newman, S. H. Strogatz and D. J. Watts, Phys. Rev. E 64, 026118 (2001).






